Uncertainty-based learning of acoustic models from noisy data
Similar resources
Uncertainty-based learning of acoustic models from noisy data
We consider the problem of acoustic modeling of noisy speech data, where the uncertainty over the data is given by a Gaussian distribution. While this uncertainty has been exploited at the decoding stage via uncertainty decoding, its usage at the training stage remains limited to static model adaptation. We introduce a new Expectation Maximisation (EM) based technique, which we call uncertainty...
Iterative Concept Learning from Noisy Data
In the present paper, we study iterative learning of indexable concept classes from noisy data. We distinguish between learning from positive data only and learning from positive and negative data; synonymously, learning from text and informant, respectively. Following [20], a noisy text (a noisy informant) for some target concept contains every correct data item infinitely often while in additio...
Learning GP-trees from Noisy Data
We discuss the problem of model selection in Genetic Programming using the framework provided by Statistical Learning Theory, i.e. Vapnik-Chervonenkis theory (VC). We present empirical comparisons between classical statistical methods (AIC, BIC) for model selection and the Structural Risk Minimization method (based on VC-theory) for symbolic regression problems. Empirical comparisons of differe...
Learning to Rank from Noisy Data
Learning to Rank, which learns the ranking function from training data, has become an emerging research area in information retrieval and machine learning. Most existing work on learning to rank assumes that the training data is clean, which is, however, not always true. The ambiguity of query intent, the lack of domain knowledge, and the vague definition of relevance levels, all make it diffic...
Learning From Noisy Singly-labeled Data
Supervised learning depends on annotated examples, which are taken to be the ground truth. But these labels often come from noisy crowdsourcing platforms, like Amazon Mechanical Turk. Practitioners typically collect multiple labels per example and aggregate the results to mitigate noise (the classic crowdsourcing problem). Given a fixed annotation budget and unlimited unlabeled data, redundant ...
Journal
Journal title: Computer Speech & Language
Year: 2013
ISSN: 0885-2308
DOI: 10.1016/j.csl.2012.07.002